Search for: All records

Creators/Authors contains: "Yuille, Alan"


  1. null (Ed.)
  2. Many realistic applications of activity recognition involve a combinatorially large set of potential activity descriptions. This makes end-to-end supervised training of a recognition system impractical, since no training set can realistically cover the entire label set. In this paper, we present an approach to fine-grained recognition that models activities as compositions of dynamic action signatures. This compositional approach allows us to reframe fine-grained recognition as zero-shot activity recognition, where a detector is composed “on the fly” from simple first-principles state machines supported by deep-learned components. We evaluate our method on the Olympic Sports and UCF101 datasets, where our model establishes a new state of the art under multiple experimental paradigms. We also extend this method to form a unique framework for zero-shot joint segmentation and classification of activities in video, and demonstrate the first results in zero-shot decoding of complex action sequences on a widely used surgical dataset. Lastly, we show that off-the-shelf object detectors can be used to recognize activities in completely de novo settings with no additional training.
  3. null (Ed.)
  4. We present a basis approach to refine noisy 3D human pose sequences by jointly projecting them onto a non-linear pose manifold, which is represented by a number of basis dictionaries, each covering a small region of the manifold. We learn the dictionaries by jointly minimizing the distance between the original poses and their projections on the dictionaries, along with the temporal jitter of the projected poses. During testing, given a sequence of noisy poses that are likely off the manifold, we project them onto the manifold for refinement using the same strategy as in training. We apply our approach to monocular 3D pose estimation and long-term motion prediction. The experimental results on the benchmark dataset show that the estimated 3D poses are notably improved in both tasks. In particular, the smoothness constraint helps generate more robust refinement results even when some poses in the original sequence have large errors.
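
The refinement idea in the last abstract above (project noisy poses onto a basis while penalizing temporal jitter) can be illustrated with a heavily simplified sketch. It is not the authors' method: it assumes a single linear dictionary with orthonormal basis vectors instead of several learned dictionaries covering a non-linear manifold, and it assumes the basis and mean pose are already given. The function names (`refine_pose_sequence`, `smooth_track`, `jitter`) and the weight `lam` are hypothetical. Under these assumptions the joint objective, the sum over frames of the squared projection error plus `lam` times the squared difference between consecutive projected poses, decouples into one small tridiagonal system per basis coefficient.

```python
from typing import List

Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def smooth_track(a: Vector, lam: float) -> Vector:
    """Minimize sum_t (c_t - a_t)^2 + lam * sum_t (c_t - c_{t-1})^2.

    The normal equations are tridiagonal, so they are solved with the
    Thomas algorithm (forward sweep, then back substitution).
    """
    n = len(a)
    if n == 1:
        return list(a)
    # Diagonal is 1 + lam * (number of temporal neighbours of frame t).
    diag = [1.0 + lam * (1 if t in (0, n - 1) else 2) for t in range(n)]
    off = -lam  # sub- and super-diagonal entries
    cp = [0.0] * n
    dp = [0.0] * n
    cp[0] = off / diag[0]
    dp[0] = a[0] / diag[0]
    for t in range(1, n):
        denom = diag[t] - off * cp[t - 1]
        cp[t] = off / denom
        dp[t] = (a[t] - off * dp[t - 1]) / denom
    c = [0.0] * n
    c[-1] = dp[-1]
    for t in range(n - 2, -1, -1):
        c[t] = dp[t] - cp[t] * c[t + 1]
    return c


def refine_pose_sequence(poses: List[Vector], mean: Vector,
                         basis: List[Vector], lam: float) -> List[Vector]:
    """Project a pose sequence onto span(basis) around `mean`, with smoothing.

    Assumption: the vectors in `basis` are orthonormal, which is what lets
    the joint problem split into independent per-coefficient problems.
    """
    centered = [[x - m for x, m in zip(p, mean)] for p in poses]
    # Per-frame coefficients of the plain (unsmoothed) projection.
    coeffs = [[dot(c, b) for b in basis] for c in centered]
    # Smooth each coefficient over time.
    tracks = [smooth_track([row[j] for row in coeffs], lam)
              for j in range(len(basis))]
    # Reconstruct the refined poses from the smoothed coefficients.
    return [
        [m + sum(tracks[j][t] * basis[j][i] for j in range(len(basis)))
         for i, m in enumerate(mean)]
        for t in range(len(poses))
    ]


def jitter(seq: List[Vector]) -> float:
    """Sum of squared differences between consecutive poses."""
    return sum(sum((x - y) ** 2 for x, y in zip(p, q))
               for p, q in zip(seq[1:], seq[:-1]))
```

With `lam = 0` the function reduces to an ordinary per-frame projection. Larger values of `lam` trade fidelity to each noisy frame for a smoother sequence, which is the role the abstract attributes to the smoothness constraint.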